Bayesian Flow Networks in Continual Learning
Bayesian Flow Networks (BFNs) have recently been proposed as one of the most
promising directions towards universal generative modelling, with the ability to
learn any data type. Their power stems from the expressiveness of neural
networks combined with Bayesian inference, which makes them well suited to
continual learning. We delve into the mechanics behind BFNs and conduct
experiments to empirically verify their generative capabilities on
non-stationary data.
Comment: Submitted to NeurIPS 2023 Workshop on Diffusion Models
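The generative process of a BFN rests on closed-form Bayesian updates of a belief over the data. As a minimal sketch of that Bayesian-inference half only (not the paper's experimental setup), the update used for continuous data combines a prior Gaussian belief with a noisy sample by precision weighting:

```python
import numpy as np

def gaussian_bayesian_update(mu, rho, y, alpha):
    """Precision-weighted update of a Gaussian belief N(mu, 1/rho) over the data.

    A noisy sample y observed with accuracy alpha sharpens the belief: the
    precision grows by alpha and the mean moves towards y. In a BFN, a neural
    network then maps the updated belief to an output distribution over the data.
    """
    rho_new = rho + alpha
    mu_new = (rho * mu + alpha * y) / rho_new
    return mu_new, rho_new

# Toy usage: a flat prior belief gradually concentrating around the value 0.7.
mu, rho = np.zeros(1), np.ones(1)
for alpha in [0.2, 0.5, 1.0, 2.0]:
    y = 0.7 + np.random.randn(1) / np.sqrt(alpha)  # noisy observation of the data
    mu, rho = gaussian_bayesian_update(mu, rho, y, alpha)
```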
MM-GEF: Multi-modal representation meet collaborative filtering
In modern e-commerce, item content features in various modalities offer
accurate and comprehensive information to recommender systems. The majority of
previous work either focuses on learning effective item representations while
modelling user-item interactions, or explores item-item relationships by
analysing multi-modal features. These methods, however, fail to incorporate the
collaborative item-user-item relationships into the multi-modal feature-based
item structure. In this work, we propose a graph-based item structure
enhancement method MM-GEF: Multi-Modal recommendation with Graph Early-Fusion,
which effectively combines the latent item structure underlying multi-modal
contents with the collaborative signals. Instead of processing the content
feature in different modalities separately, we show that the early-fusion of
multi-modal features provides significant improvement. MM-GEF learns refined
item representations by injecting structural information obtained from both
multi-modal and collaborative signals. Through extensive experiments on four
publicly available datasets, we demonstrate systematic improvements of our
method over state-of-the-art multi-modal recommendation methods.
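As a hedged illustration of the early-fusion idea only (a sketch, not the full MM-GEF model, which further injects collaborative signals through graph early-fusion), per-modality item features can be normalised, concatenated, and used to derive a content-based item-item graph:

```python
import numpy as np

def early_fuse(modality_feats):
    """Concatenate L2-normalised per-modality item features (e.g. image and text)
    into one content vector per item, instead of processing modalities separately."""
    normed = [f / (np.linalg.norm(f, axis=1, keepdims=True) + 1e-12)
              for f in modality_feats]
    return np.concatenate(normed, axis=1)

def knn_item_graph(item_feats, k=10):
    """Build a k-nearest-neighbour item-item adjacency from the fused features;
    this latent item structure can then be combined with collaborative signals."""
    sim = item_feats @ item_feats.T
    np.fill_diagonal(sim, -np.inf)          # exclude self-similarity
    neighbours = np.argsort(-sim, axis=1)[:, :k]
    adj = np.zeros_like(sim)
    rows = np.repeat(np.arange(sim.shape[0]), k)
    adj[rows, neighbours.ravel()] = 1.0
    return adj
```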
ICICLE: Interpretable Class Incremental Continual Learning
Continual learning enables incremental learning of new tasks without
forgetting those previously learned, resulting in positive knowledge transfer
that can enhance performance on both new and old tasks. However, continual
learning poses new challenges for interpretability, as the rationale behind
model predictions may change over time, leading to interpretability concept
drift. We address this problem by proposing Interpretable Class-InCremental
LEarning (ICICLE), an exemplar-free approach that adopts a prototypical
part-based approach. It consists of three crucial novelties: interpretability
regularization that distills previously learned concepts while preserving
user-friendly positive reasoning; proximity-based prototype initialization
strategy dedicated to the fine-grained setting; and task-recency bias
compensation devoted to prototypical parts. Our experimental results
demonstrate that ICICLE reduces the interpretability concept drift and
outperforms the existing exemplar-free methods of common class-incremental
learning when applied to concept-based models. We make the code available.
Comment: Under review; code will be shared after acceptance.
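A heavily hedged sketch of what an interpretability regularizer of this kind could look like (the exact ICICLE formulation is given in the paper; the tensor shapes and loss below are illustrative): keep the similarity maps of previously learned prototypical parts close to those produced by a frozen copy of the old model, so earlier concepts keep pointing at the same image regions.

```python
import torch
import torch.nn.functional as F

def prototype_distillation_loss(new_sim_maps, old_sim_maps):
    """Illustrative interpretability regularization for prototypical-part models.

    new_sim_maps / old_sim_maps: (batch, n_old_prototypes, H, W) similarity maps
    between image patches and previously learned prototypical parts, produced by
    the current model and by a frozen copy of the previous model. Penalising
    their discrepancy distills the old concepts and limits interpretability
    concept drift while new classes are being learned.
    """
    return F.mse_loss(new_sim_maps, old_sim_maps.detach())
```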
FeCAM: Exploiting the Heterogeneity of Class Distributions in Exemplar-Free Continual Learning
Exemplar-free class-incremental learning (CIL) poses several challenges since
it prohibits the rehearsal of data from previous tasks and thus suffers from
catastrophic forgetting. Recent approaches to incrementally learning the
classifier by freezing the feature extractor after the first task have gained
much attention. In this paper, we explore prototypical networks for CIL, which
generate new class prototypes using the frozen feature extractor and classify
the features based on the Euclidean distance to the prototypes. In an analysis
of the feature distributions of classes, we show that classification based on
Euclidean metrics is successful for jointly trained features. However, when
learning from non-stationary data, we observe that the Euclidean metric is
suboptimal and that feature distributions are heterogeneous. To address this
challenge, we revisit the anisotropic Mahalanobis distance for CIL. In
addition, we empirically show that modeling the feature covariance relations is
better than previous attempts at sampling features from normal distributions
and training a linear classifier. Unlike existing methods, our approach
generalizes to both many- and few-shot CIL settings, as well as to
domain-incremental settings. Interestingly, without updating the backbone
network, our method obtains state-of-the-art results on several standard
continual learning benchmarks. Code is available at
https://github.com/dipamgoswami/FeCAM.
Comment: Accepted at NeurIPS 2023
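A minimal sketch of the core idea (nearest-prototype classification with per-class anisotropic covariance, assuming features from a frozen backbone; FeCAM's exact covariance shrinkage and normalisation choices are described in the paper):

```python
import numpy as np

def class_statistics(features, labels, shrinkage=1e-2):
    """Per-class prototype (mean) and inverse covariance from frozen-backbone features."""
    stats = {}
    for c in np.unique(labels):
        feats = features[labels == c]
        mean = feats.mean(axis=0)
        # Shrinkage keeps the covariance invertible when a class has few samples.
        cov = np.cov(feats, rowvar=False) + shrinkage * np.eye(feats.shape[1])
        stats[c] = (mean, np.linalg.inv(cov))
    return stats

def mahalanobis_predict(queries, stats):
    """Assign each query to the class whose prototype is closest under the
    anisotropic Mahalanobis distance; with identity covariance this reduces
    to the Euclidean nearest-prototype rule."""
    classes = sorted(stats)
    dists = np.stack(
        [np.einsum("nd,de,ne->n", queries - m, inv_cov, queries - m)
         for m, inv_cov in (stats[c] for c in classes)],
        axis=1,
    )
    return np.array(classes)[dists.argmin(axis=1)]
```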
Augmentation-aware Self-supervised Learning with Guided Projector
Self-supervised learning (SSL) is a powerful technique for learning robust
representations from unlabeled data. By learning to remain invariant to applied
data augmentations, methods such as SimCLR and MoCo are able to reach quality
on par with supervised approaches. However, this invariance may be harmful to
solving some downstream tasks which depend on traits affected by augmentations
used during pretraining, such as color. In this paper, we propose to foster
sensitivity to such characteristics in the representation space by modifying
the projector network, a common component of self-supervised architectures.
Specifically, we supplement the projector with information about augmentations
applied to images. In order for the projector to take advantage of this
auxiliary guidance when solving the SSL task, the feature extractor learns to
preserve the augmentation information in its representations. Our approach,
coined Conditional Augmentation-aware Self-supervised Learning (CASSLE), is
directly applicable to typical joint-embedding SSL methods regardless of their
objective functions. Moreover, it does not require major changes in the network
architecture or prior knowledge of downstream tasks. In addition to an analysis
of sensitivity towards different data augmentations, we conduct a series of
experiments, which show that CASSLE improves over various SSL methods, reaching
state-of-the-art performance in multiple downstream tasks.
Comment: Preprint under review. Code: https://github.com/gmum/CASSL
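A hedged sketch of the conditioning mechanism (module and argument names are illustrative; CASSLE's exact projector design is given in the paper and repository): the projector receives, alongside the image representation, a vector encoding the augmentation parameters applied to that view, so augmentation-specific variation can be absorbed after the backbone rather than erased from its representations.

```python
import torch
import torch.nn as nn

class ConditionalProjector(nn.Module):
    """Joint-embedding-style projection head additionally conditioned on a vector
    describing the augmentations applied to the input view (aug_dim is a
    hypothetical encoding, e.g. crop coordinates and colour-jitter strengths)."""

    def __init__(self, feat_dim=2048, aug_dim=16, hidden_dim=2048, proj_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + aug_dim, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, proj_dim),
        )

    def forward(self, features, aug_info):
        # Concatenating augmentation information lets the projector account for
        # augmentation-induced differences between views, so the backbone is not
        # forced to discard augmentation-related traits such as colour.
        return self.net(torch.cat([features, aug_info], dim=-1))
```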